Abstract. We address in this paper a new computational biology problem that aims at understanding
a mechanism that could potentially be used to genetically manipulate natural insect populations
infected by inherited, intra-cellular parasitic bacteria. In this problem, which we denote by Mod/Resc
Parsimony Inference, we are given a boolean matrix and the goal is to find two other boolean
matrices with a minimum number of columns such that an appropriately defined operation on these
matrices gives back the input. We show that this is formally equivalent to the Bipartite Biclique
Edge Cover problem and derive some complexity results for our problem using this equivalence. We
provide a new fixed-parameter tractability approach for solving both problems that slightly improves
upon a previously published algorithm for Bipartite Biclique Edge Cover. Finally, we present
experimental results where we applied some of our techniques to a real-life data set.

Keywords: Computational biology, biclique edge covering, bipartite graph, boolean matrix,
NP-completeness, graph theory, fixed-parameter tractability, kernelization.

1 Introduction

Wolbachia is a genus of inherited, intra-cellular bacteria that infect many arthropod species,
including a significant proportion of insects. The bacterium was first identified in 1924 by M. Hertig
and S. B. Wolbach in Culex pipiens, a species of mosquito. Wolbachia spreads by altering the
reproductive capabilities of its hosts [6]. One of these alterations consists in inducing so-called
cytoplasmic incompatibility [7]. This phenomenon, in its simplest expression, results in the death of
embryos produced in crosses between males carrying the infection and uninfected females. A more
complex pattern is the death of embryos seen in crosses between males and females carrying different
Wolbachia strains. The study of Wolbachia and cytoplasmic incompatibility is of interest due to the
high incidence of such infections, amongst others in human disease vectors such as mosquitoes,
where cytoplasmic incompatibility could potentially be used as a driver mechanism for the genetic
manipulation of natural populations.

The molecular mechanisms underlying cytoplasmic incompatibility are currently unknown, but
the observations are consistent with a "toxin / antitoxin" model [16]. According to this model, the
bacteria present in males modify the sperm (the so-called modification, or mod factor) by depositing
a "toxin" during its maturation. Bacteria present in females, on the other hand, deposit an antitoxin
(rescue, or resc factor) in the eggs, so that offspring of infected females can develop normally. The
simple compatibility patterns seen in several insect host species [1-3] have led to the general view
that cytoplasmic incompatibility relies on a single pair of mod / resc genes. However, more complex
patterns, such as those seen in Table 1 for the mosquito Culex pipiens [5], suggest that this conclusion
cannot be generalized. The aim of this paper is to provide a first model and algorithm to determine
the minimum number of mod and resc genes required to explain a compatibility dataset for a given
insect host. Such an algorithm will have an important impact on the understanding of the genetic
architecture of cytoplasmic incompatibility. Beyond Wolbachia, the method proposed here can be
applied to any parasitic bacteria inducing cytoplasmic incompatibility.
Fig. 1. The Culex pipiens dataset. Rows represent females and columns males.

Let us now propose a formal description of this problem. Let the compatibility matrix C be an
n-by-n matrix describing the observed cytoplasmic compatibility relationships among n Wolbachia
strains, with females in rows and males in columns. For the Culex pipiens dataset, the content of
the C matrix is directly given by Table 1. For each entry $C_{i,j}$ of this matrix, a value of 1 indicates
that the cross between the i'th female and j'th male is incompatible, while a value of 0 indicates
it is compatible. No intermediate levels of incompatibility are observed in Culex pipiens, so that
such a discrete code (0 or 1) is sufficient to describe the data. Let the mod matrix M be an n-by-k
matrix, with n strains and k mod genes. For each $M_{i,j}$ entry, a 0 indicates that strain i does not
carry gene j, and a 1 indicates that it does carry this gene. Similarly, the rescue matrix R is an
n-by-k matrix, with n strains and k resc genes, where $R_{i,j}$ entries indicate whether strain i carries
gene j. A cross between female i and male j is compatible only if strain i carries at least all the
rescue genes matching the mod genes present in strain j. Using this rule, one can assess whether
an (M, R) pair is a solution to the C matrix, that is, to the observed data.

We can easily find non-parsimonious solutions to this problem, that is, large M and R matrices
that are solutions to C, as will be proven in the next section. However, solutions may also exist with
fewer mod and resc genes. We are interested in the minimum number of genes for which solutions
to C exist, and the set of solutions for this minimum number. This problem can be summarized as
follows: Let C (compatibility) be a boolean n-by-n matrix. A pair of n-by-k boolean matrices M
(mod) and R (resc) is called a solution to C if, for any row j in R and row i in M, $C_{i,j} = 0$ if and
only if $R_{j,\ell} \ge M_{i,\ell}$ holds for all $\ell$, $1 \le \ell \le k$. This appropriately models the fact stated above that,
for any cross to be compatible, the female must carry at least all the rescue genes matching the
mod genes present in the male. For a given matrix C, we are interested in the minimum value of k
for which solutions to C exist, and the set of solutions for this minimum k. We refer to this problem
as the Mod/Resc Parsimony Inference problem (see also Section 2). Since in some cases data
(on females or males) may be missing, the compatibility matrix C has dimension n-by-m for n not
necessarily equal to m. We will consider this more general situation in what follows.

In this paper, we present the Mod/Resc Parsimony Inference problem and prove it is
equivalent to a well-studied graph-theoretic problem known in the literature by the name of Bipartite
Biclique Edge Cover. In this problem, we are given a bipartite graph, and we want to cover its
edges with a minimum number of complete bipartite subgraphs (bicliques). This problem is known
to be NP-complete, and thus Mod/Resc Parsimony Inference turns out to be NP-complete
as well. In Section 4, we investigate a previous fixed-parameter tractability approach [8] for solving
the Bipartite Biclique Edge Cover problem and improve its algorithm. In addition, we show
a reduction between this problem and the Clique Edge Cover problem. Finally, in Section 5, we
present experimental results where we applied some of these techniques to the Culex pipiens data
set presented in Table 1. This provided a surprising finding from a biological point of view.

2 Problem Definition and Notation 

In this section, we briefly review some notation and terminology that will be used throughout the
paper. We also give a precise mathematical definition of the Mod/Resc Parsimony Inference
problem we study. For this, we first need to define a basic operation between two boolean vectors:

Definition 1. The $\otimes$ vector multiplication is an operation between two boolean vectors
$U, V \in \{0,1\}^k$ such that:

$$U \otimes V := \begin{cases} 1 & \text{if } U[i] > V[i] \text{ for some } i \in \{1,\dots,k\}, \\ 0 & \text{otherwise.} \end{cases}$$

In other words, the result of the $\otimes$ multiplication is 0 if, for all corresponding locations, the value
in the second vector is not less than in the first.

The reader should note that this operation is not symmetric. For example, if $U := (0,1,1,0)$
and $V := (1,1,1,0)$, then $U \otimes V = 0$, while $V \otimes U = 1$. We next generalize the $\otimes$ multiplication to
boolean matrices. This follows easily from the observation that the boolean vectors $U, V \in \{0,1\}^k$
may be seen as matrices of dimension 1-by-k. We thus use the same symbol $\otimes$ to denote the
operation applied to matrices.

Definition 2. The row-by-row matrix multiplication $\otimes$ is a function
$\{0,1\}^{n \times k} \times \{0,1\}^{m \times k} \to \{0,1\}^{n \times m}$ such that $C = M \otimes R$ iff $C_{i,j} = M_i \otimes R_j$ for all
$i \in \{1,\dots,n\}$ and $j \in \{1,\dots,m\}$. (Here $M_i$ and $R_j$ respectively denote the i'th row of M and the
j'th row of R.)

Definition 3. In the Mod/Resc Parsimony Inference problem, the input is a boolean matrix
$C \in \{0,1\}^{n \times m}$, and the goal is to find two boolean matrices $M \in \{0,1\}^{n \times k}$ and $R \in \{0,1\}^{m \times k}$ such
that $C = M \otimes R$ and with k minimal.

We first need to prove that there is always a correct solution to the Mod/Resc Parsimony
Inference problem. Here we show that there is always a solution with as many mod and resc genes
as the minimum between the number of male and female strains in the dataset.
Lemma 1. The Mod/Resc Parsimony Inference problem always has a solution.

Proof. A satisfying output for the Mod/Resc Parsimony Inference problem always exists for
any possible C of size n-by-m. For instance, let M be of size n-by-n and equal to the identity
matrix, and let R be of size m-by-n and such that $R = \overline{C}^{T}$, i.e. the transpose of the entry-wise
complement of C. This solution is correct since the only 1-value in an arbitrary row $M_i$ of the
matrix M is at location $M_{i,i}$. Thus, the only situation where $C_{i,j} = 1$ is when $R_{j,i} = 0$, which is the
case by construction. □
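
The operation of Definitions 1 and 2 and the construction used in the proof of Lemma 1 can be illustrated with a short Python sketch. Matrices are assumed to be lists of 0/1 rows; the function names and the small matrix C are hypothetical and only serve as an illustration.

```python
from typing import List

Matrix = List[List[int]]


def otimes_vec(u: List[int], v: List[int]) -> int:
    """Definition 1: 1 if u[i] > v[i] for some position i, else 0."""
    return int(any(a > b for a, b in zip(u, v)))


def otimes(M: Matrix, R: Matrix) -> Matrix:
    """Definition 2: entry (i, j) combines row i of M with row j of R."""
    return [[otimes_vec(m_row, r_row) for r_row in R] for m_row in M]


def trivial_solution(C: Matrix):
    """Lemma 1: M is the n-by-n identity, R the transposed complement of C."""
    n, m = len(C), len(C[0])
    M = [[int(i == l) for l in range(n)] for i in range(n)]
    R = [[1 - C[i][j] for i in range(n)] for j in range(m)]
    return M, R


U, V = [0, 1, 1, 0], [1, 1, 1, 0]
print(otimes_vec(U, V), otimes_vec(V, U))  # the asymmetric example: 0 1

C = [[1, 0, 1], [0, 0, 1]]  # hypothetical 2-by-3 compatibility matrix
M, R = trivial_solution(C)
print(otimes(M, R) == C)  # True
```

The solution built this way uses k = n columns, so it is in general far from parsimonious.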

We will be using some standard graph-theoretic terminology and notation. We use G, G', and
so forth to denote graphs in general, where $V(G)$ denotes the vertex set of a graph G, and $E(G)$ its
edge set. By a subgraph of G, we mean a graph G' with $V(G') \subseteq V(G)$ and $E(G') \subseteq E(G)$. For a
bipartite graph G, i.e. a graph whose vertex set can be partitioned into two classes with no edges
occurring between vertices of the same class, we use $V_1(G)$ and $V_2(G)$ to denote the two vertex
classes of G. A complete bipartite graph (biclique) is a bipartite graph G with
$E(G) := \{\{u,v\} : u \in V_1(G), v \in V_2(G)\}$. We will sometimes use $B$, $B_1$, and so forth to denote bicliques.

3 Equivalence to Bipartite Biclique Edge Cover

In this section, we show that the Mod/Resc Parsimony Inference problem is equivalent to
a classical and well-studied graph-theoretical problem known in the literature as the Bipartite
Biclique Edge Cover problem. Using this equivalence, we first derive the complexity
status of Mod/Resc Parsimony Inference, and later devise FPT algorithms for this problem.
We begin with a formal definition of the Bipartite Biclique Edge Cover problem.

Definition 4. In the Bipartite Biclique Edge Cover problem, the input is a bipartite
graph G, and the goal is to find the minimum number of biclique subgraphs $B_1,\dots,B_k$ of G
such that $E(G) = \bigcup_{\ell=1}^{k} E(B_\ell)$.

Given a bipartite graph G with $V_1(G) := \{u_1,\dots,u_n\}$ and $V_2(G) := \{v_1,\dots,v_m\}$, the
bi-adjacency matrix of G is a boolean matrix $A(G) \in \{0,1\}^{n \times m}$ defined by
$A(G)_{i,j} := 1 \iff \{u_i,v_j\} \in E(G)$. In this way, every boolean matrix C corresponds to a bipartite
graph, and vice versa.

Theorem 1. Let C be a boolean matrix of size $n \times m$. Then there are two matrices $M \in \{0,1\}^{n \times k}$
and $R \in \{0,1\}^{m \times k}$ with $C = M \otimes R$ iff the bipartite graph G with $A(G) := C$ has a biclique edge
cover with k bicliques.

Proof. ($\Leftarrow$) Let G be the bipartite graph with the bi-adjacency matrix C, and suppose G has
a biclique edge cover $B_1, B_2, \dots, B_k$. We construct two boolean matrices M and R as follows: Let
$V_1(G) := \{u_1,\dots,u_n\}$ and $V_2(G) := \{v_1,\dots,v_m\}$. We define:

1. $M_{i,\ell} = 1 \iff u_i \in V_1(B_\ell)$.

2. $R_{j,\ell} = 0 \iff v_j \in V_2(B_\ell)$.

An illustration of this construction is given in Figure 2.

We argue that $C = M \otimes R$. Consider an arbitrary location $C_{i,j} = 1$. By definition we have
$\{u_i,v_j\} \in E(G)$. Since the bicliques $B_1,\dots,B_k$ cover all edges of G, we know that there is some
$\ell \in \{1,\dots,k\}$ with $u_i \in V_1(B_\ell)$ and $v_j \in V_2(B_\ell)$. By construction we know that $M_{i,\ell} = 1$ and
$R_{j,\ell} = 0$, and so $M_i \otimes R_j = 1$, which means that the entry at row i and column j in $M \otimes R$ is equal
to 1. On the other hand, if $C_{i,j} = 0$, then $\{u_i,v_j\} \notin E(G)$, and thus there is no biclique $B_\ell$ with
$u_i \in V_1(B_\ell)$ and $v_j \in V_2(B_\ell)$. As a result, for all $\ell \in \{1,\dots,k\}$, if $M_{i,\ell} = 1$ then $R_{j,\ell} = 1$ as well,
which means that the result of the $\otimes$ multiplication between the i'th row in M and the j'th row in
R will be equal to 0.

($\Rightarrow$) Assume there are two matrices $M \in \{0,1\}^{n \times k}$ and $R \in \{0,1\}^{m \times k}$ with $C = M \otimes R$.
Construct k subgraphs $B_1,\dots,B_k$ of G, where the $\ell$'th subgraph is defined as follows:

1. $u_i \in V_1(B_\ell) \iff M_{i,\ell} = 1$.

2. $v_j \in V_2(B_\ell) \iff R_{j,\ell} = 0$.

3. For $u_i \in V_1(B_\ell)$ and $v_j \in V_2(B_\ell)$: $\{u_i,v_j\} \in E(B_\ell) \iff \{u_i,v_j\} \in E(G)$.

We first argue that each of the subgraphs $B_1,\dots,B_k$ is a biclique. Consider an arbitrary
subgraph $B_\ell$, and an arbitrary pair of vertices $u_i \in V_1(B_\ell)$ and $v_j \in V_2(B_\ell)$. By construction, it follows that
$M_{i,\ell} = 1$ and $R_{j,\ell} = 0$. As a result, it must be that $C_{i,j} = 1$, which means that $\{u_i,v_j\} \in E(G)$. Next,
we argue that $\bigcup_\ell E(B_\ell) = E(G)$. Consider an arbitrary edge $\{u_i,v_j\} \in E(G)$. Since $C = A(G)$, we
have $C_{i,j} = 1$. Furthermore, since $M \otimes R = C$, there must be some $\ell \in \{1,\dots,k\}$ with $M_{i,\ell} > R_{j,\ell}$.
However, this is exactly the condition for having $u_i$ and $v_j$ in the biclique subgraph $B_\ell$. It follows
that indeed $\bigcup_\ell E(B_\ell) = E(G)$, and thus the theorem is proved. □
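
The two constructions in the proof of Theorem 1 translate directly into code. The following Python sketch assumes 0-based indices and represents a biclique as a pair of index sets; the names `cover_to_matrices` and `matrices_to_cover`, as well as the example matrix, are hypothetical.

```python
from typing import List, Set, Tuple

Biclique = Tuple[Set[int], Set[int]]  # (indices in V1, indices in V2)


def cover_to_matrices(n: int, m: int, cover: List[Biclique]):
    """M[i][l] = 1 iff u_i is in B_l; R[j][l] = 0 iff v_j is in B_l."""
    M = [[int(i in left) for left, _ in cover] for i in range(n)]
    R = [[int(j not in right) for _, right in cover] for j in range(m)]
    return M, R


def matrices_to_cover(M, R) -> List[Biclique]:
    """B_l collects the rows with M[i][l] = 1 and the columns with R[j][l] = 0."""
    k = len(M[0])
    return [({i for i, row in enumerate(M) if row[l] == 1},
             {j for j, row in enumerate(R) if row[l] == 0}) for l in range(k)]


# Hypothetical example: the graph with this bi-adjacency matrix
# has a biclique edge cover of size 2.
C = [[1, 1, 0],
     [1, 1, 1],
     [0, 0, 1]]
cover = [({0, 1}, {0, 1}), ({1, 2}, {2})]
M, R = cover_to_matrices(3, 3, cover)
print(M)  # [[1, 0], [1, 1], [0, 1]]
print(R)  # [[0, 1], [0, 1], [1, 0]]
print(matrices_to_cover(M, R) == cover)  # True
```

Each column of M and R corresponds to one biclique, which is why the number of mod/resc gene pairs equals the size of the cover.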

Due to the equivalence between Mod/Resc Parsimony Inference and Bipartite Biclique
Edge Cover, we can infer the complexity of our problem from known complexity results regarding
Bipartite Biclique Edge Cover. First, since Bipartite Biclique Edge Cover is well known
to be NP-complete [15], it follows that Mod/Resc Parsimony Inference is NP-complete
as well. Furthermore, Gruber and Holzer [11] recently showed that the Bipartite Biclique Edge
Cover problem cannot be approximated within a factor of $n^{1/3-\varepsilon}$ unless P = NP, where n is
the total number of vertices. Since the reduction given in Theorem 1 is clearly an
approximation-preserving reduction, we can deduce the following:

Theorem 2. Mod/Resc Parsimony Inference is NP-complete, and furthermore, for all $\varepsilon > 0$,
the problem cannot be approximated within a factor of $(n+m)^{1/3-\varepsilon}$ unless P = NP.

4 Fixed-parameter tractability

In this section, we explore a parameterized complexity approach [4, 9, 14] for the Mod/Resc
Parsimony Inference problem. Due to the equivalence shown in the previous section, we focus for
convenience on Bipartite Biclique Edge Cover. In parameterized complexity, problem
instances are appended with an additional parameter, usually denoted by k, and the goal is to find
an algorithm for the given problem which runs in time $f(k) \cdot n^{O(1)}$, where f is an arbitrary
computable function. In our context, our goal is to determine whether a given input bipartite graph G
with n vertices has a biclique edge cover of size k in time $f(k) \cdot n^{O(1)}$.

4.1 The kernelization

Fleischner et al. [8] studied the Bipartite Biclique Edge Cover problem in the context of
parameterized complexity. The main result in their paper is to provide a kernel for the problem
based on the techniques given by Gramm et al. [10] for the similar Clique Edge Cover problem.
Kernelization is a central technique in parameterized complexity which is best described as a
polynomial-time transformation that converts instances of arbitrary size to instances of a size
bounded by the problem parameter (usually of the same problem), while mapping "yes"-instances
to "yes"-instances, and "no"-instances to "no"-instances. More precisely, a kernelization algorithm
A for a parameterized problem (language) $\Pi$ is a polynomial-time algorithm such that there exists
some computable function f, such that, given an instance $(I, k)$ of $\Pi$, A produces an instance
$(I', k')$ of $\Pi$ with:

- $|I'| + k' \le f(k)$, and

- $(I, k) \in \Pi \iff (I', k') \in \Pi$.

We refer the reader to e.g. [12, 14] for more information on kernelization.

A typical kernelization algorithm works with reduction rules, which transform a given instance
to a slightly smaller equivalent instance in polynomial time. The typical argument used when
working with reduction rules is that once none of these can be applied, the resultant instance has size
bounded by a function of the parameter. For Bipartite Biclique Edge Cover, two
kernelization rules have been applied by Fleischner et al. [8]:

RULE 1: If G has a vertex with no neighbors, remove this vertex without changing the parameter.
RULE 2: If G has two vertices with identical neighbors, remove one of these vertices without
changing the parameter.

Lemma 2 ([8]). Applying Rules 1 and 2 above exhaustively gives a kernelization algorithm for
Bipartite Biclique Edge Cover that runs in $O(n^3)$ time, and transforms an instance $(G,k)$
to an equivalent instance $(G',k)$ with $|V(G')| \le 2^k$ and $|E(G')| \le 2^{2k}$.
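
Rules 1 and 2 are easy to state on the bi-adjacency matrix: a vertex without neighbors is an all-zero row or column, and two vertices with identical neighbors are two equal rows or two equal columns. The Python sketch below takes this matrix view, which is an assumption made for illustration and not necessarily how [8] implements the rules; the function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def _reduce_rows(A: Matrix) -> Matrix:
    """Drop all-zero rows (Rule 1) and repeated rows (Rule 2), keeping first copies."""
    seen, kept = set(), []
    for row in A:
        key = tuple(row)
        if any(row) and key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def _transpose(A: Matrix) -> Matrix:
    return [list(col) for col in zip(*A)]


def kernelize(C: Matrix) -> Matrix:
    """Apply Rules 1 and 2 to both vertex classes until nothing changes."""
    while True:
        reduced = _transpose(_reduce_rows(_transpose(_reduce_rows(C))))
        if reduced == C:
            return C
        C = reduced


C = [[1, 1, 0, 0],
     [1, 1, 0, 0],   # same neighbors as the first row: removed by Rule 2
     [0, 0, 0, 0],   # no neighbors: removed by Rule 1
     [0, 1, 0, 1]]
print(kernelize(C))  # [[1, 1, 0], [0, 1, 1]]
```

The parameter k is left untouched, in line with both rules.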
We add two additional rules, which will be needed for the properties established below.

RULE 3: If there is a vertex v with exactly one neighbor u in G, then remove both v and u, and
decrease the parameter by one.

Lemma 3. Rule 3 is correct.

Proof. Assume a biclique cover of size k of the graph. At least one of its bicliques covers the edge
$\{u,v\}$. Since this is the only edge incident to v, any biclique that covers $\{u,v\}$ contains only the
vertex u among the vertices of the vertex class of u. If such a biclique does not cover all the edges
incident to u, add the missing neighbors of u to it; the result is still a biclique. This biclique now
covers every edge incident to u or v, so the other $k-1$ bicliques, with u removed from them, cover
the graph obtained by removing u and v. Conversely, a cover of size $k-1$ of that graph together
with the biclique formed by u and its neighbors is a cover of size k of G. □

RULE 4: If there is a vertex v in G which is adjacent to all vertices in the opposite bipartition class
of G, then remove v without decreasing the parameter.

Lemma 4. Rule 4 is correct.

Proof. After applying Rule 3 above, each remaining vertex in the graph has at least two neighbors.
Assume a biclique cover of size k of all the edges except those incident to vertex v. Assume w.l.o.g.
that $v \in V_1(G)$. Since each vertex $u \in V_2(G)$ has degree at least 2, it is incident to an edge which is
covered by the biclique cover. It therefore belongs to some biclique in this cover. Now add vertex v
to the vertex set of each biclique in the cover. Since v is adjacent to all the vertices of $V_2(G)$,
each modified subgraph is still a biclique, and the new solution covers all the edges, including
those of vertex v, and is of the same size. □
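
A single application of Rule 3 or Rule 4 can be sketched on the bi-adjacency matrix as follows. This is a minimal Python sketch under the assumption that rows and columns stand for the two vertex classes; the function names are hypothetical. Rule 4 is only applied when every vertex has at least two neighbors, since the proof of Lemma 4 relies on this.

```python
from typing import List, Optional, Tuple

Matrix = List[List[int]]


def _drop(C: Matrix, row: Optional[int] = None, col: Optional[int] = None) -> Matrix:
    """Return a copy of C without the given row and/or column."""
    return [[x for j, x in enumerate(r) if j != col]
            for i, r in enumerate(C) if i != row]


def reduce_step(C: Matrix, k: int) -> Optional[Tuple[Matrix, int]]:
    """Apply Rule 3 if possible, otherwise Rule 4; return None if neither applies."""
    if not C or not C[0]:
        return None
    row_deg = [sum(r) for r in C]
    col_deg = [sum(c) for c in zip(*C)]
    # Rule 3: a vertex with exactly one neighbor is removed together with it.
    for i, d in enumerate(row_deg):
        if d == 1:
            return _drop(C, row=i, col=C[i].index(1)), k - 1
    for j, d in enumerate(col_deg):
        if d == 1:
            i = next(i for i, r in enumerate(C) if r[j] == 1)
            return _drop(C, row=i, col=j), k - 1
    # Rule 4 is only safe when every vertex has at least two neighbors.
    if min(row_deg + col_deg) < 2:
        return None
    for i, d in enumerate(row_deg):
        if d == len(C[0]):
            return _drop(C, row=i), k
    for j, d in enumerate(col_deg):
        if d == len(C):
            return _drop(C, col=j), k
    return None


step1 = reduce_step([[1, 1], [1, 1]], 1)  # Rule 4 removes the first row
print(step1)                              # ([[1, 1]], 1)
print(reduce_step(*step1))                # Rule 3 then gives ([], 0)
```

In the example, the complete bipartite graph on 2 + 2 vertices is reduced to an empty graph with parameter 0, which matches the fact that a single biclique covers it.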

Regarding the time complexity of the new rules we introduced, it is clear that once a vertex has
been found to which a rule should be applied, applying each rule takes $O(n)$ time. Thus, including
the time necessary to find such a vertex, the time required for each rule is $O(n^2)$. Since one can
apply the reduction rules at most $O(n)$ times, the total time required for our extended kernelization
remains $O(n^3)$. We remark that although the new rules do not change the kernelization size, which
remains $2^k$ vertices for a solution of size k, they will be useful in the following section.

4.2 Bipartite Biclique Edge Cover and Clique Edge Cover 

In this section, we show the connection between the Bipartite Biclique Edge Cover and the
Clique Edge Cover problems. We show that in the context of fixed-parameter tractability, we
can easily translate our problem to the classical clique covering problem and then use a solution of
the latter to solve our problem. For instance, this gives another way to kernelize the problem and
makes the heuristics mentioned in [10] available.

Given a kernelized bipartite graph G' as an instance of the Bipartite Biclique Edge Cover
problem, we transform G' into a (non-bipartite) graph G'' defined by $V(G'') := V(G')$ and
$E(G'') := E(G') \cup \{\{u,v\} : u,v \in V_1(G') \text{ or } u,v \in V_2(G')\}$.

Theorem 3. The edges of G' can be covered with k bicliques iff the edges of G'' can be covered with
k + 2 cliques.

Proof. Suppose $B_1,\dots,B_k$ is a biclique edge cover of G'. Then each $V(B_i)$, $i \in \{1,\dots,k\}$, induces
a clique in G''. Furthermore, the only remaining edges which are not covered in G'' are the ones
between two vertices of $V_1(G')$ or between two vertices of $V_2(G')$, which can be covered by the two
cliques induced by these vertex sets in G''. Altogether this gives us k + 2 cliques that cover all edges
in G''. Conversely, take a clique edge cover $K_1,\dots,K_c$ of G'' with c = k + 2. Due to the fourth
kernelization rule, we know that there is no vertex in $V_1(G')$ which is connected to all vertices in
$V_2(G')$, and vice versa, in both G' and G''. It follows that there must be at least two cliques in
$\{K_1,\dots,K_c\}$, say $K_1$ and $K_2$, with $V(K_1) \subseteq V_1(G')$ and $V(K_2) \subseteq V_2(G')$. Thus, there is a subset
of the cliques in $\{K_3,\dots,K_c\}$ which have vertices in both partition classes of G', and which cover
all the edges in G'. Taking the corresponding bicliques in G', and adding duplicated bicliques if
necessary, gives us k bicliques that cover all edges in G'. □
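
The construction of G'' and the easy direction of Theorem 3 can be checked on a small example. The Python sketch below assumes vertices are strings and edges are two-element frozensets; all names and the example graph are hypothetical. It only illustrates how k bicliques of G' yield k + 2 cliques of G''; the converse direction, which needs Rule 4, is not implemented.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Edge = FrozenSet[str]


def build_g2(left: List[str], right: List[str], edges: Set[Edge]) -> Set[Edge]:
    """E(G'') = E(G') plus every pair inside V1(G') and every pair inside V2(G')."""
    inside = {frozenset(p) for side in (left, right) for p in combinations(side, 2)}
    return set(edges) | inside


def cliques_from_bicliques(left, right, bicliques: List[Tuple[Set[str], Set[str]]]):
    """k bicliques of G' give k + 2 vertex sets: one per biclique, plus V1 and V2."""
    return [a | b for a, b in bicliques] + [set(left), set(right)]


def covers(cliques: List[Set[str]], edges: Set[Edge]) -> bool:
    """True if every edge lies inside at least one of the given vertex sets."""
    return all(any(e <= c for c in cliques) for e in edges)


left, right = ["u1", "u2", "u3"], ["v1", "v2", "v3"]
edges = {frozenset(e) for e in [("u1", "v1"), ("u1", "v2"), ("u2", "v1"),
                                ("u2", "v2"), ("u2", "v3"), ("u3", "v3")]}
bicliques = [({"u1", "u2"}, {"v1", "v2"}), ({"u2", "u3"}, {"v3"})]

g2 = build_g2(left, right, edges)
cliques = cliques_from_bicliques(left, right, bicliques)
print(len(g2), len(cliques), covers(cliques, g2))  # 12 4 True
```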



4.3 Algorithms 

After the kernelization algorithm is applied, the next step is usually to solve the problem by
brute force. This is what is done in [8]. However, the time complexity given there is inaccurate, and
the parametric-dependent time bound of their algorithm is $O(k^{4^k} 2^{3k}) = O(2^{4^k \lg k + 3k})$ instead of
the $O(2^{2k^2+3k})$ bound stated in their paper. Furthermore, the algorithm they describe is initially
given for the related Bipartite Biclique Edge Partition problem (where each edge is allowed
to appear exactly once in a biclique), and the adaptation of this algorithm to the Bipartite
Biclique Edge Cover problem is left vague and imprecise. Here, we suggest two possible
brute-force procedures for the Bipartite Biclique Edge Cover problem, each of which outperforms
the algorithm of [8] in the worst case. We assume throughout that we are working with a kernelized
instance obtained by applying the algorithm described in Section 4.1, i.e. a pair $(G',k)$ where G'
is a bipartite graph with at most $2^k$ vertices (and consequently at most $4^k$ edges).

The first brute-force algorithm: For each $k' \le k$, try all possible partitions of the edge set $E(G')$ of
G' into k' subsets. For each such partition $\Pi = \{E_1,\dots,E_{k'}\}$, check whether each of the subgraphs
$G'[E_1],\dots,G'[E_{k'}]$ is a biclique, where $G'[E_i]$ is the subgraph of G' induced by $E_i$. If yes, report
$G'[E_1],\dots,G'[E_{k'}]$ as a solution. If some $G'[E_i]$ is not a biclique, check whether edges in
$E(G') \setminus E_i$ can be added to $G'[E_i]$ in order to make the graph a biclique. Continue with the next
partition if some graph in $G'[E_1],\dots,G'[E_{k'}]$ cannot be extended in this way in order to get a
biclique, and otherwise report the solution found. Finally, if the above procedure fails for all
partitions of $E(G')$ into $k' \le k$ subsets, report that G' does not have a biclique edge cover of size k.

Lemma 5. The above algorithm correctly determines whether G' has a bipartite biclique edge cover
of size k in time $O\left(2^{2^{2k} \lg k + 2k + \lg k} / k!\right)$.

Proof. Correctness of the above algorithm is immediate in case a solution is found. To see that
the algorithm is also correct when it reports that no solution can be found, observe that for any
biclique edge cover $B_1,\dots,B_k$ of G', the set $\{E_1,\dots,E_k\}$ with $E_i := E(B_i) \setminus \bigcup_{j<i} E(B_j)$ defines a
partition of $E(G')$ (with some of the $E_i$ possibly empty), and given this partition, the algorithm
above would find the biclique edge cover of G'. Correctness of the algorithm thus follows.

Regarding the time complexity, the time needed for appending edges to each subgraph is at most
$O(|V(G')|^2) = O(2^{2k})$, and thus a total of $O(2^{2k} k) = O(2^{2k + \lg k})$ time is required for the entire
partition. The number of possible partitions of $E(G')$ into k disjoint sets is the Stirling number of the
second kind $S(2^{2k}, k)$, which has been shown in [13] to be asymptotically equal to
$O\left(k^{2^{2k}} / k!\right) = O\left(2^{2^{2k} \lg k} / k!\right)$.
Thus, the total complexity of the algorithm is $O\left(2^{2^{2k} \lg k + 2k + \lg k} / k!\right)$. □
The second brute-force algorithm: We generate the set $\mathcal{K}(G')$ of all possible inclusion-wise maximal
bicliques in G', and try all possible k-subsets of $\mathcal{K}(G')$ to see whether one covers all edges in
G'. Correctness of the algorithm is immediate since one can always restrict oneself to using only
inclusion-wise maximal bicliques in a biclique edge cover. To generate all maximal bicliques, we
first transform G' into the graph G'' given in Theorem 3. Thus, every inclusion-wise maximal
biclique in G' is an inclusion-wise maximal clique in G''. We then use the algorithm of [18] on the
complement graph $\overline{G''}$ of G'', i.e. the graph defined by $V(\overline{G''}) := V(G'')$ and
$E(\overline{G''}) := \{\{u,v\} : u,v \in V(G''), u \ne v, \text{ and } \{u,v\} \notin E(G'')\}$.

Theorem 4. The Bipartite Biclique Edge Cover problem can be solved in $O(f(k) + n^3)$ time,
where $f(k) := 2^{k 2^{k-1} + 3k}$.

Proof. Given a bipartite graph G as an instance to Bipartite Biclique Edge Cover, we first
apply the kernelization algorithm to obtain an equivalent graph G' with at most $2^k$ vertices, and then
apply the brute-force algorithm described above to determine whether G' has a biclique edge cover of
size k. Correctness of this algorithm follows directly from Section 4.1 and the correctness of the
brute-force procedure. To analyze the time complexity of this algorithm, we first note that Prisner showed
that any bipartite graph on n vertices has at most $2^{n/2}$ inclusion-wise maximal bicliques [18]. This
implies that $|\mathcal{K}(G')| \le 2^{2^{k-1}}$. The algorithm of [17] runs in $O(|V(G')| \, |E(G')| \, |\mathcal{K}(G')|)$ time, which
is $O(2^k \, 2^{2k} \, 2^{2^{k-1}}) = O(2^{2^{k-1} + 3k})$. Finally, the total number of k-subsets of $\mathcal{K}(G')$ is $O(2^{k 2^{k-1}})$, and
checking whether each of these subsets covers the edges of G' requires $O(|V(G')| \, |E(G')|) = O(2^{3k})$
time. Thus, the total time complexity of the entire algorithm is
$O(2^{2^{k-1} + 3k} + 2^{k 2^{k-1} + 3k} + n^3) = O(2^{k 2^{k-1} + 3k} + n^3)$. □
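
The second brute-force procedure can be sketched in a few lines of Python. Note the assumptions: the graph is given by its bi-adjacency matrix with 0-based indices, the maximal bicliques are enumerated by a naive closure over all row subsets rather than by the algorithm cited above, and covers of size at most k are tried in increasing size. The function names are hypothetical, and the sketch is only meant for small kernelized instances.

```python
from itertools import combinations
from typing import FrozenSet, List, Optional, Set, Tuple

Biclique = Tuple[FrozenSet[int], FrozenSet[int]]


def maximal_bicliques(C: List[List[int]]) -> Set[Biclique]:
    """Naive enumeration: close every non-empty set of rows under common neighbors."""
    n, m = len(C), len(C[0])
    found = set()
    for size in range(1, n + 1):
        for rows in combinations(range(n), size):
            cols = frozenset(j for j in range(m) if all(C[i][j] for i in rows))
            if cols:
                closed = frozenset(i for i in range(n) if all(C[i][j] for j in cols))
                found.add((closed, cols))
    return found


def biclique_cover(C: List[List[int]], k: int) -> Optional[List[Biclique]]:
    """Return at most k maximal bicliques covering every 1-entry of C, or None."""
    edges = {(i, j) for i, row in enumerate(C) for j, x in enumerate(row) if x}
    candidates = sorted(maximal_bicliques(C),
                        key=lambda b: (sorted(b[0]), sorted(b[1])))
    for size in range(0, k + 1):
        for choice in combinations(candidates, size):
            covered = {(i, j) for rows, cols in choice for i in rows for j in cols}
            if covered == edges:
                return list(choice)
    return None


C = [[1, 1, 0],
     [1, 1, 1],
     [0, 0, 1]]
print(len(maximal_bicliques(C)))      # 3
print(biclique_cover(C, 1))           # None
print(len(biclique_cover(C, 2)))      # 2
```

Restricting the candidates to inclusion-wise maximal bicliques is what keeps the number of subsets to try bounded by a function of k on a kernelized instance.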

It is worthwhile mentioning that some particular bipartite graphs have a number of inclusion-wise
maximal bicliques which is polynomial in the number of their vertices. For these types of
bipartite graphs, we could improve on the worst-case analysis given in the theorem above. For
instance, a bipartite chordal graph G has at most $|E(G)|$ inclusion-wise maximal bicliques [18]. A
bipartite graph with n vertices and no induced cocktail-party graph of order $\ell$ has at most $n^{2(\ell-1)}$
inclusion-wise maximal bicliques [17]. The cocktail-party graph of order $\ell$ is the graph with nodes
consisting of two rows of paired nodes in which all nodes but the paired ones are connected with
a graph edge (for a full definition, see [17]). Observing that the algorithm in Section 4.1 preserves
chordality and does not introduce any new cocktail-party induced subgraphs, we obtain the following
corollary:

Corollary 1. The Bipartite Biclique Edge Cover problem can be solved in $O(2^{2k^2 + 3k} + n^3)$
time when restricted to chordal bipartite graphs, and in $O(2^{2k^2(\ell-1) + 3k} + n^3)$ time when restricted
to bipartite graphs with no induced cocktail-party graphs of order $\ell$.

5 Experimental Results 

We performed experiments with the parameterized algorithms on the Culex pipiens dataset, given in
Table 1. We implemented the algorithms in the C++ programming language, with source code of
approximately 2500 lines.

The main difficulty in practice is to find the minimal size k. Different approaches could be used.
One would proceed by first checking whether a solution of small size exists, since this is easy to check
using the FPT approach, and then increasing the size until reaching the smallest size k for which a
solution exists. Another would proceed by using different fast and efficient heuristics to discover
a solution of a given size k' that in general will be greater than the optimal size k sought. Then,
applying dichotomy (the optimal solution is between 1 and k' - 1), the minimal size could be found
using the FPT approach for the middle value between 1 and k' - 1, and so on. The source code
and the results can be viewed on the webpage http://lbbe.univ-lyon1.fr/-Nor-Igor-.html.
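
Both search strategies can be sketched in Python, assuming a decision procedure `has_cover(k)` that answers whether a solution of size k exists and is monotone in k. The function names are hypothetical, and the oracle in the usage lines is a stand-in, not the C++ implementation described above.

```python
from typing import Callable


def minimal_k_linear(has_cover: Callable[[int], bool], upper: int) -> int:
    """First strategy: try k = 0, 1, 2, ... until a solution exists."""
    for k in range(upper + 1):
        if has_cover(k):
            return k
    raise ValueError("no solution of size <= upper")


def minimal_k_dichotomy(has_cover: Callable[[int], bool], heuristic_k: int) -> int:
    """Second strategy: bisect between 1 and the size found by a heuristic."""
    lo, hi = 1, heuristic_k  # invariant: a solution of size hi is known to exist
    while lo < hi:
        mid = (lo + hi) // 2
        if has_cover(mid):
            hi = mid      # a solution of size mid exists, look for a smaller one
        else:
            lo = mid + 1  # no solution of size mid, the optimum is larger
    return lo


# Stand-in oracle: pretend the optimum is 5 and a heuristic found size 12.
oracle = lambda k: k >= 5
print(minimal_k_linear(oracle, 12))     # 5
print(minimal_k_dichotomy(oracle, 12))  # 5
```

The linear strategy only ever calls the decision procedure for sizes up to the optimum, which matters because its running time grows very quickly with k; the dichotomy needs fewer calls but may query larger sizes.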

The result obtained on the Culex pipiens dataset indicates that 8 pairs of mod/resc genes are
required to explain the dataset. This appears to be in sharp contrast to the simpler patterns seen in
other host species [2, 3, 1] that had led to the general belief that cytoplasmic incompatibility can be
explained with a single pair of mod / resc genes. In biological terms, this result means that contrary
to earlier beliefs, the number of genetic determinants of cytoplasmic incompatibility present in a
single Wolbachia strain can be large, consistent with the view that it might involve repeated genetic
elements such as transposable elements or phages.
Appendix

Fig. 3. The Culex pipiens solution.